NYSED A-Series WWW Server Version 2.0P CGI Interface
COMS GATEWAY INTERFACE (CGI) OVERVIEW		May 25, 1999

This document describes the overall design and function of the CGI
Interface for web administrators.  Note that there is a CGI Library,
CGILIB, available to dramatically simplify the necessary code within
the CGI applications.  The CGILIB and its usage are described in a
separate document and will be of interest to both administrators and
application programmers.


CGI DIAGRAM:

                        -------------------------------------------
                        |                                         |
                        |              C  O  M  S                 |
                        |        ......................>.         |
                        ---------:----------------------:----------
                            (COMS SEND)                 :
                                 :                      :
                         ---------------            (Environment
                         | WWW Server  |             Information)
                         ---------------                :
                                 |                      V
 --------------          ---------------         -------------------
 |    Web     |          |   Request   |  CGI    |CGI|   COMS      |
 |  Client    |<=TCP/IP=>|   Handler   |<=PORT==>|LIB|   CGI       |
 | (Browser)  |          |             |  FILE   |   | Application |
 --------------          ---------------         -------------------

OVERVIEW:

The CGI interface works as follows:

 1.	The web server request handler receives a request and
	determines that the request matches the COMS ScriptAlias from
	the WWWSERV/CONFIG file.

 2.	The request handler reads the WWWSERV/COMS/CONFIG file to find
	a match for the "application" node of the URL.  If a match is
	found, the request handler gets the CGI app's agenda name,
	window name, timeout in seconds, and whether the request
	contents are to be passed as text (EBCDIC) or binary to the
	CGI app.

 3.	The request handler then creates a COMS message to be sent
	to the CGI app.  The COMS message contains "Environment
	Variables" for the request in the same manner as environment
	variables for a UNIX based CGI app.  One variable, "Port", is
	the CGI port file name to be used by the CGI app for
	reading/writing from/to the web client via the request
	handler's CGI port.  The CGI port name is unique to each
	request so that any CGI Application that has gotten
	out of sync cannot cause trouble for subsequent requests.
	Two variables, HEADER_LENGTH and CONTENT_LENGTH, indicate the
	number of bytes in the request header and request contents so
	that the CGI app can correctly read them in from the CGI port
	file.

	The HTTP/1.1 specification allows request contents to be
	'chunked'.  The CGILIB will detect chunked input and handle it.
	(More on chunking below.)

 4.	The request handler causes a CGI_SEND event to notify the
	Server to do a COMS SEND of the environment message on behalf
	of the request handler.

 5.	COMS passes the message to the input queue of the CGI app and
	fires up the application if necessary.

 6.	The CGI app does a RECEIVE to get the environment message.
	The format of the environment message is listed below.

 7.*	The CGI app sets the CGI port file name to the value received
	in the COMS message, opens the CGI port file, and then reads in
	the request header and, if it is a POST request, it reads the
	contents as well.  (Note that if the CGI app does not open the
	CGI port file within the timeout configured in the
	WWWSERV/COMS/CONFIG file the request handler will return a
	"501 Service Unavailable" result to the Client.)

 8.*	The request contents (POST method) or the QUERY_STRING (GET
	method) may be urlencoded and it is up to the CGI application
	to do the decoding.  Note that the CGILIB provides a routine
	to accomplish this.

 9.*	The CGI app now has all information pertaining to the request:
	(1) the environment variables from the COMS message, (2) the
	request header from the client, and (3) the contents from the
	client.  At this point the CGI app must sift through all the
	information and extract those parts that directly pertain to
	the current transaction.

10.	The CGI app then processes the transaction whether it be an
	inquiry or update transaction.

11.*	The CGI app then writes one or more response header fields to
	the CGI port to indicate to the request handler what type of
	response is being returned.  The response header fields are
	included in the full response header that is returned to the
	client.  Next the CGI app writes a "blank line" consisting of
	a carriage return and line feed (standard HTTP) to terminate
	the response header.  Now, if there are contents to be returned,
	the CGI app writes the contents to the CGI port file.  The CGI
	app is responsible for generating correct HTML code if the
	response is an HTML document.

12.	While the CGI app is writing the response to the CGI port file,
	the request handler creates a full response header, sends it
	back to the client and writes the contents, if any, to the
	client as they are received from the CGI app.

13.	The CGI app then closes the port file to indicate to the
	request handler the end of the response.

14.	The request handler closes the client connection when all
	contents have been sent.

An * above indicates a step where the CGILIB should be used to
simplify the CGI app.
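
As an illustration of step 7 above, the following sketch reads the
request header and contents by their advertised lengths.  It is written
in Python purely for readability and is not the CGILIB implementation.
The function name read_request and the in-memory stand-in for the CGI
port file are assumptions made for this example.

```python
import io

def read_request(env, port):
    """Read the request header and, for a POST, the contents from the port.

    env  -- dict of environment variables taken from the COMS message
    port -- file-like object standing in for the opened CGI port file
    """
    header = port.read(int(env["HEADER_LENGTH"]))
    contents = b""
    if env.get("REQUEST_METHOD") == "POST" and "CONTENT_LENGTH" in env:
        contents = port.read(int(env["CONTENT_LENGTH"]))
    return header, contents

# Hypothetical request used only to exercise the function.
request_header = b"POST /coms/testapp HTTP/1.0\r\n\r\n"
request_body = b"name=Joe+Smith"
env = {
    "REQUEST_METHOD": "POST",
    "HEADER_LENGTH": str(len(request_header)),
    "CONTENT_LENGTH": str(len(request_body)),
}
header, contents = read_request(env, io.BytesIO(request_header + request_body))
```

Reading exactly HEADER_LENGTH and then CONTENT_LENGTH bytes keeps the
header and the contents from running together.  Chunked contents carry
no CONTENT_LENGTH and need the separate handling described under
CHUNKING.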

CGI FEATURES:

The CGI app has access to the complete request header and is
not restricted by COMS message sizes.  This means the Server
does not need to be modified for new header fields that may
be defined in the future.

The COMS/CONFIG file can specify whether the CGI app wants the
request and response contents translated between ASCII and EBCDIC.

The Server guarantees that a response is sent to the client
even if the CGI app is not able to respond.


CGI COMS ENVIRONMENT MESSAGE LAYOUT:

CHARS 1-5	LIT "CGI: "
CHARS 6-9	CGI version (e.g. "0.10")
CHAR  10	<LF>
CHARS 11-16	LIT "PORT: "
CHARS 17-24	PORT FILE NAME (9999W999) REQ HAND MIX#, "W", SEQ#
CHAR  25	<LF>
CHARS 26-36	LIT "PATH_INFO: "
CHARS 37-?	URL FOLLOWING CGI SCRIPT NAME (GOOD TO USE AS A TRANCODE)
CHAR  ?		<LF>
CHARS ?-?	OTHER CGI ENVIRONMENT VARS. SEPARATED BY <LF>S
		(e.g. "SERVER_NAME: www.nysed.gov<LF>")
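
The layout above can be split into name/value pairs with very little
code.  The sketch below is in Python for illustration only.  It assumes
the message has already been received as text, and the names
parse_environment and split_port_name are hypothetical, not CGILIB
entry points.

```python
def parse_environment(message):
    """Split an environment message into a dict of name -> value."""
    env = {}
    for line in message.split("\n"):          # fields are separated by <LF>
        if not line:
            continue
        name, _, value = line.partition(":")  # split at the first colon only
        if value.startswith(" "):
            value = value[1:]                 # drop the blank after "NAME:"
        env[name] = value
    return env

def split_port_name(port_name):
    """Split a port file name such as 1234W567 into (mix number, sequence number)."""
    mix, _, seq = port_name.partition("W")
    return mix, seq

# Sample message built from the literals shown in the layout; values are made up.
sample = ("CGI: 0.10\n"
          "PORT: 1234W567\n"
          "PATH_INFO: /MYTRANCODE\n"
          "SERVER_NAME: www.nysed.gov\n")
env = parse_environment(sample)
mix, seq = split_port_name(env["PORT"])
```

Splitting at the first colon only keeps values that themselves contain
colons intact.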


CGI ENVIRONMENT VARIABLE FIELDS THAT MAY BE INCLUDED:

ENV. VARIABLES		SAMPLE VALUES		COMMENTS
--------------		-------------		--------
SERVER_SOFTWARE		NYSED-A-Series/2.0E BETA
SERVER_NAME		www.nysed.gov		AS CONFIGURED
SERVER_PORT		80			AS CONFIGURED
SERVER_PROTOCOL		HTTP/1.1		FROM SERVER
REQUEST_METHOD		GET			(OR POST)
SCRIPT_NAME		/coms/testapp		FROM REQUEST
QUERY_STRING		name=Joe+Smith		MAX 255 CHARS
PATH_INFO		/MYTRANCODE		FROM REQUEST
REMOTE_HOST		ppp1.nysed.gov		BLANK IF NOT AVAILABLE
REMOTE_ADDR		140.34.166.22		FROM REQUEST
REMOTE_PROTOCOL		HTTP/1.0		FROM REQUEST
HEADER_LENGTH		823			FROM REQUEST
CONTENT_TYPE		application/x-www-form-urlencoded
CONTENT_LENGTH *	54			FROM REQUEST
TRANSFER_ENCODING *	chunked			FROM REQUEST
ORIG_HOST		ASERIESNAME.		FROM A-SERIES

* CONTENT_LENGTH or TRANSFER_ENCODING will be present, but NOT both.
When TRANSFER_ENCODING is present, the contents are chunked which
is a new feature of HTTP/1.1 and means that the client did not provide
a content length in the request header.  Only after the CGILIB reads
the chunked contents can it determine the actual size.  (Or determine
that the content length exceeds the size of the available array space.)
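
The following sketch shows one way an application could check which of
the two fields it received, and decode an urlencoded value such as the
QUERY_STRING sample in the table.  It is in Python for illustration and
uses the Python standard library in place of the CGILIB routine.  The
function names content_framing and decode_form are assumptions made for
this example.

```python
from urllib.parse import parse_qs

def content_framing(env):
    """Report how the request contents are delimited: 'chunked', 'length' or 'none'."""
    has_length = "CONTENT_LENGTH" in env
    chunked = env.get("TRANSFER_ENCODING", "").lower() == "chunked"
    if has_length and chunked:
        raise ValueError("CONTENT_LENGTH and TRANSFER_ENCODING must not both be present")
    if chunked:
        return "chunked"
    return "length" if has_length else "none"

def decode_form(urlencoded):
    """Decode urlencoded text into a dict of field name -> list of values."""
    return parse_qs(urlencoded)

fields = decode_form("name=Joe+Smith")   # QUERY_STRING sample from the table
```

Here fields["name"] holds the single value "Joe Smith".  The "+" is
decoded to a blank, as urlencoding requires.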


RESPONSE MESSAGES FROM THE CGI APPLICATION VIA THE CGI PORT:

The CGI app must send a short header which may be followed by
contents such as an HTML document.


RESPONSE HEADER FIELDS:

These are sent from the CGI app to the request handler
preceding any contents such as an HTML document.  The response
header fields recognized by the NYSED web server are:

	Content-Type:		REQUIRED if there are contents
	Content-Length:		OPTIONAL if there are contents
	Location:		Results in a "302 Move Temp" response
	Status:			**Now supported**

The following examples show how this is accomplished:

1.  Content-Type: text/html<CR><LF><CR><LF>HTML Document follows...
2.  Location: http://www.nysed.gov/some/other/doc.html<CR><LF><CR><LF>
3.  Status: 205 Reset Content<CR><LF><CR><LF>

A pair of <CR><LF>s terminates the response header as per the
HTTP specs.  If the CGI app does not send a response header to the
server, the server will guarantee that an error response is sent
to the client.
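
The three examples above can be produced by a small helper.  This is a
Python sketch for illustration.  The name build_response is
hypothetical, and the bytes are built in ASCII here, whereas an
A-Series application may write EBCDIC and rely on the configured
translation.

```python
CRLF = "\r\n"

def build_response(fields, contents=b""):
    """Return the header fields, the terminating blank line, then the contents."""
    lines = ["%s: %s" % (name, value) for name, value in fields]
    header = CRLF.join(lines) + CRLF + CRLF   # the pair of <CR><LF>s ends the header
    return header.encode("ascii") + contents

page = build_response([("Content-Type", "text/html")], b"<html>...</html>")
redirect = build_response([("Location", "http://www.nysed.gov/some/other/doc.html")])
reset = build_response([("Status", "205 Reset Content")])
```

With more than one field, each field ends in a single <CR><LF> and only
the last is followed by the extra <CR><LF> that closes the header.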


HOW THE NYSED WEB SERVER HANDLES RESPONSE HEADER FIELDS:

The response header fields provided by the CGI applications determine
how the web server handles the response.  The web server checks for the
header fields in the following order:

 1.	If the 'Status' field is present then that is used to indicate
	the result to the client.  The web server allows contents
	to follow, but it is up to the client application to know
	for which status codes contents are (and are not)
	appropriate.  Also, there must be a 'Content-Type' header field
	included if there are contents.

 2.	Otherwise, if the 'Location' field is present then a
	'302 Move Temp' result is returned.  The web server allows
	contents to follow in which case there must be a 'Content-Type'
	header field included.

 3.	Otherwise, if the 'Content-Type' field is present then
	a '200 OK' response is sent to the client with the contents
	that are expected to follow.  An additional 'Content-Length'
	header field is beneficial but optional.

 4.	If none of the above fields are received then the web server
	returns a '501 Service Unavailable' response after the timeout
	configured in the WWWSERV/COMS/CONFIG file.
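
The order of the checks above can be sketched as a short function.
This is an illustration in Python, not the server's code.  The name
select_result is hypothetical, and exact-case matching of field names
is an assumption of the sketch.

```python
def select_result(fields):
    """Pick the result returned to the client from the CGI app's header fields.

    fields -- dict of response header field names to values
    """
    if "Status" in fields:
        return fields["Status"]           # 1. Status wins over everything else
    if "Location" in fields:
        return "302 Move Temp"            # 2. redirect
    if "Content-Type" in fields:
        return "200 OK"                   # 3. normal response with contents
    return "501 Service Unavailable"      # 4. nothing usable before the timeout

result = select_result({"Location": "http://www.nysed.gov/",
                        "Content-Type": "text/html"})
```

Because 'Location' is checked before 'Content-Type', a response
carrying both fields is returned as a redirect with contents.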

CGI TIMEOUTS:

The overall transaction timeout is determined by the Timeout setting
configured in the WWWSERV/CONFIG file.
The Application Timeout configured in the COMS/CONFIG file is used as
an 'inactive' timeout for the CGI app.


CHUNKING:

Chunking is a new feature of HTTP/1.1.  Its main purpose is to allow
the sender of contents (client or server) to notify the recipient when
the transfer is complete for the situations where the sender does not
know ahead of time what the content length will be.  This is the case
for most CGI requests.

Briefly, chunking uses a format where the chunk length of each chunk
(block) precedes the chunk data, and the last chunk of data is followed
by a zero length chunk to indicate completion.
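
As a concrete illustration of the format, the Python sketch below
encodes a list of blocks as chunks.  HTTP/1.1 writes each chunk length
in hexadecimal, followed by <CR><LF>, the data, and another <CR><LF>.
The function name encode_chunked is an assumption made for this
example, and optional chunk extensions and trailers are left out.

```python
def encode_chunked(blocks):
    """Encode an iterable of byte blocks in HTTP/1.1 chunked format."""
    out = bytearray()
    for block in blocks:
        if not block:
            continue                      # an empty chunk would signal completion early
        out += b"%x\r\n" % len(block)     # chunk length in hexadecimal
        out += block + b"\r\n"
    out += b"0\r\n\r\n"                   # zero length chunk marks the end
    return bytes(out)

encoded = encode_chunked([b"name=Joe", b"+Smith"])
```

The sender never needs to know the total length; it only needs the
length of the block it is about to write.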

Prior to chunking the sender's only option to indicate completion was
to close the connection.  This prevented the possibility of keeping the
connection alive for additional requests.

**NOTE!**
Existing CGI applications using the CGI_FORM and WRITE_PORT
services of the CGILIB do *not* require any changes to support chunking.



The NYSED A-Series CGI interface handles REQUEST chunking as follows:

 1.	The web server determines that the request contents on a POST
	request are chunked by the presence of a request header field
	received from the client:

		Transfer-Encoding: chunked

 2.	The web server indicates this to the CGI application by including
	the following environment field in the COMS message:

		TRANSFER_ENCODING: chunked

 3.	When the CGI application opens the CGIPORT file, the web server
	writes the request header to the port file.

 4.	The web server then writes the chunked contents to the CGIPORT
	file including the chunk headers and the final zero length chunk.
	The chunks are not modified except for conversion to EBCDIC
	as follows:

	The chunk headers, which include the chunk sizes are converted
	from ASCII to EBCDIC.

	The chunk data is converted from ASCII to EBCDIC if configured
	in the WWWSERV/COMS/CONFIG file, otherwise the original data is
	passed as 'binary' data.

	The end of each chunk (a CR and LF) is converted to EBCDIC.

 5.	The CGI application is required to interpret the chunked input.
	(* Note that the CGILIB will do this when the CGI_FORM Service
	is used by the application.)  Because of the nature of chunking,
	the application has no way to know the size of the contents
	until AFTER they have been read in.  This makes it impossible to
	know ahead of time the array size needed to store the contents.
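
The interpretation required in step 5 can be sketched as follows.  This
is Python for illustration, not the CGI_FORM service.  The name
read_chunked and the max_size limit (standing in for the available
array space) are assumptions.  For readability the sketch works on
ASCII bytes, whereas on the A-Series the chunk headers arrive converted
to EBCDIC.

```python
import io

def read_chunked(port, max_size):
    """Read chunked contents from a file-like port and return the joined data."""
    contents = bytearray()
    while True:
        size_line = port.readline()
        if not size_line:
            raise ValueError("port closed before the zero length chunk")
        # Anything after ';' is a chunk extension and is ignored here.
        size = int(size_line.split(b";")[0].strip().decode("ascii"), 16)
        if size == 0:
            break                          # final zero length chunk
        if len(contents) + size > max_size:
            raise ValueError("contents exceed the available array space")
        contents += port.read(size)
        port.readline()                    # <CR><LF> that ends the chunk
    return bytes(contents)

port = io.BytesIO(b"8\r\nname=Joe\r\n6\r\n+Smith\r\n0\r\n\r\n")
contents = read_chunked(port, max_size=1000)
```

The size check happens per chunk because the total is only known once
the zero length chunk has been reached.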


The NYSED A-Series CGI interface handles RESPONSE chunking as follows:

 1.	The CGI application writes the response to the CGIPORT the
	same way that it did prior to chunking.**

 2.	The web server knows from the HTTP version of the client whether
	or not the client can support chunking.  If the client can support
	chunking (HTTP/1.1), the web server chunks the output from the
	CGI application and sends the contents to the client.

 3.	If the output has been chunked, the web server will keep the
	connection open in anticipation of another request from the client.
	If the output has not been chunked, the web server will close the
	connection as was done previously with HTTP/1.0.

**	If the CGI application includes a Content-Length header field
	in the response, then the web server does *not* chunk the
	contents and does keep the connection open if the client has
	indicated the capability of sending multiple requests on the
	current connection.
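
The response rules above, including the Content-Length exception, can
be summarized in a small decision function.  This is a Python sketch
for illustration only.  The name plan_response and its parameters are
hypothetical and do not describe the server's internals.

```python
def plan_response(client_http_version, has_content_length, client_keeps_alive):
    """Return (chunk_output, keep_connection_open) for one CGI response.

    client_http_version -- e.g. "HTTP/1.0" or "HTTP/1.1"
    has_content_length  -- True if the CGI app wrote a Content-Length field
    client_keeps_alive  -- True if the client indicated it can send
                           multiple requests on the current connection
    """
    if has_content_length:
        return False, client_keeps_alive   # length known: no chunking needed
    if client_http_version == "HTTP/1.1":
        return True, True                  # chunk and keep the connection open
    return False, False                    # HTTP/1.0: close to mark the end

plan = plan_response("HTTP/1.1", has_content_length=False, client_keeps_alive=True)
```

In every case where the connection stays open, the client has some
other way (a length or a zero length chunk) to find the end of the
response.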